Abstract



Our previously presented method for high through- 
put computational screening of mutant activity 
(Hediger et al., http://arxiv.org/abs/1203.2950) 
is benchmarked against experimentally measured 
amidase activity for 22 mutants of Candida antarc- 
tica lipase B (CalB). Using an appropriate cutoff 
criterion for the computed barriers, the qualitative 
activity of 15 out of 22 mutants is correctly pre- 
dicted. The method identifies four of the six most 
active mutants with >3-fold wild type activity and
seven out of the eight least active mutants with
<0.5-fold wild type activity. The method is further
used to screen all sterically possible (386) double-, 
triple- and quadruple-mutants constructed from the 
most active single mutants. Based on the bench- 
mark test at least 20 new promising mutants are 
identified. 



Introduction 



In industry, one frequently tries to modify an enzyme
in order to enhance its functionality in a certain
way [1-5]. From an application point of view,
one of the most interesting questions is how to mod- 
ify an enzyme such that its activity is enhanced 
compared to wild type or such that a new kind 
of activity is introduced into the enzyme [6]. It
is therefore of considerable relevance to have a
method available which efficiently discriminates,
a priori, between promising candidates for
experimental study and mutants which can be
excluded from the study.

Numerous methods are currently being proposed
and developed for the description of enzyme activities,
the theoretical background of which ranges
from phenomenological approaches [7-10] to quantum
mechanics based ab initio descriptions [11-19].
However one can expect that methods which are
highly demanding in terms of set-up efforts and
computational time are less likely to be employed
in industrial contexts where qualitative or
semi-quantitative conclusions can be of sufficient use
in the beginning and planning phase of a wet-lab
study. Few approaches, while taking into account
a number of approximations and limitations in accuracy,
aim at being used in parallel or prior to
experimental work [20, 21] and are not designed to
be used in a high throughput fashion.
Hediger et al. have recently published a computa- 
tional method for high throughput computational 
screening of mutant activity [22] and in this paper
we benchmark the method against experimentally 
measured amidase activity for mutants of Candida 
antarctica lipase B (CalB) and apply the method 
to identify additional promising mutants. 



Methods 



We introduce the experimental set-up and the 
methodology for comparing experimental and com- 
putational data. We describe a benchmarking and 
a combinatorial study of CalB mutant activity. 
Experimentally, variants of Candida antarctica lipase B
(CalB) were either produced in Pichia pastoris
with a C-terminal His6-tag for subsequent affinity
purification or expressed in Aspergillus oryzae
without a terminal tag followed by a three-step
purification procedure.

It is generally accepted that in serine protease-like
enzymes, the formation of the tetrahedral intermediate
(TI, Fig. 1) is rate determining [23-26] and
throughout this work we assume that a lower barrier
for this reaction correlates with increased overall
activity of the enzyme.

The substrate used throughout this study is
N-benzyl-2-chloroacetamide. The organisms used for
expression of the individual variants are indicated
in Table 1.



Generation of CalB Variants without His-tags

Variants of CalB carrying the CalB signal pep- 
tide were generated at the DNA level using 
QuickChange mutagenesis on the corresponding 
gene residing in a dual E.coli / Aspergillus Pichia 
pastoris expression vector. The PCR was per- 
formed with proofreading DNA polymerase (New 
England Biolabs, NEB). To remove parent tem- 
plates, they were methylated in vitro prior to PCR 
with CpG methyltransferase (from NEB) and di- 
gested in vivo after transformation of competent 
E. coli DH5α cells (TaKaRa) according to the
instructions from the manufacturer. Plasmid DNA






was isolated from transformed E.coli strains, and 
sequenced to verify the presence of the desired sub- 
stitutions. Confirmed plasmid variants were used to 
transform an Aspergillus oryzae strain that is negative
in pyrG (orotidine-5'-phosphate decarboxylase)
and in the proteases pepC (a serine protease homologous
to yscB), alp (an alkaline protease) and Npl (a neutral
metalloprotease I) to avoid degradation of the lipase
variants during and after fermentation.
The transformed Aspergillus strains were fermented 
as submerged culture in shake flasks and the lipase 
variants secreted into the fermentation medium. 
After the fermentation, the lipase variants were pu- 
rified from the sterile filtered fermentation medium 
in a 3 step procedure with 1) hydrophobic inter- 
action chromatography on decylamine-agarose, 2) 
buffer exchange by gel filtration and 3) ion ex- 
change chromatography with cation exchange on 
SP-sepharose at pH 4.5. The lipase variant solu- 
tions were stored frozen. 

Generation of CalB Variants with His-tags 

Variants of CalB carrying the CalB signal pep- 
tide and C-terminal His-tags were generated at the 
DNA level using SOE-PCR and inserted into a dual 
E. coli/Pichia pastoris expression vector using In-Fusion
cloning (ClonTech). The SOE-PCR was performed
with Phusion DNA polymerase (NEB) and
template DNA of the CalB gene. The cloned plasmids
were transformed into competent E. coli DH5α
cells (TaKaRa). Plasmid DNA was isolated from
transformed E.coli strains, and sequenced to ver- 
ify the presence of the desired substitutions. Con- 
firmed plasmid variants were used to transform a 
Pichia pastoris strain that is Mut(s), Suc(+), His(-).
The transformed Pichia strains were fermented
as submerged culture in deep well plates and se- 
cretion of the lipase variants into the fermentation 
medium was induced by addition of methanol. After
the fermentation, the lipase variants were purified
from the cleared supernatants using a standard
His-tag purification protocol (Qiagen) and buffer-exchanged
into 50 mM phosphate buffer, pH 7.0, using
Amicon Ultra centrifugal filter devices with a
10 kDa cutoff (Merck Millipore).

Activity Measurement 

Amidase activity of CalB variants was determined
in a two-step fluorimetric assay previously described
by Henke et al. [27]. First, enzymatic hydrolysis
of N-benzyl-2-chloroacetamide was performed
in 96-well microtiter plates in 200 µL phosphate-buffered
aqueous solution pH 7.0 including 10% organic
co-solvent (THF or DMSO). Reactions containing
5 mM amide substrate, 0.3-3 µM enzyme,
and 12 µg/mL BSA were incubated for 18-20 h at
37°C in a shaker incubator. In a second step, 50 µL
of a 20 mM 4-nitro-7-chloro-benzo-2-oxa-1,3-diazole
(NBD-Cl) solution in 1-hexanol was added and the
reaction of NBD-Cl with benzylamine formed during
amide hydrolysis proceeded under identical reaction
conditions for another hour.
Fluorescence of the final reaction product was determined
with excitation at 485 nm and measured
emission at 538 nm. Calibration of the amide hydrolysis
reaction was performed on each assay plate
with benzylamine covering a concentration range
between 0.05 and 5 mM. All enzymatic activities
were corrected for non-enzymatic background reaction
determined under identical conditions without
enzyme present.

Computational Details

The computational method used to estimate the reaction
barriers of the CalB mutants has been described
in detail earlier [22] and is only summarized
here.

The reaction barriers are estimated computationally
by preparing molecular model structures [22]
(consisting of around 840 atoms) of the enzyme substrate
complex (ES) and the tetrahedral intermediate
(TI), between which linear interpolation is
carried out to generate structures of the enzyme
on the reaction path. The geometry of each interpolation
frame is optimized while keeping the distance
between the nucleophilic carbon C20 of the
substrate and Oγ of serine 105 (Fig. 1) fixed at a
specific value $d_i = d_{ini} - i\,(d_{ini} - d_{fin})/10$, where
$d_{ini}$ and $d_{fin}$ are the distances between C20 and Oγ
in the ES complex and TI, respectively (in Å, 10
being the number of interpolation frames and i the
interpolation frame index). In geometry optimization
calculations, the gradient convergence criterion
is set to 0.5 kcal/(mol Å) and a linear scaling
implementation of the PM6 method (MOZYME [28])
together with a NDDO cutoff of 15 Å is applied.
The energy profile of the reaction barrier at the
PM6 level of theory [29] is subsequently mapped
out by carrying out conventional SCF calculations
of each optimized interpolation frame. All calculations
are carried out using the MOPAC suite of programs [30, 31].
The molecular models are based on
the crystal structure of the CalB enzyme with PDB
identifier 1LBS.
In order to prevent significant
rearrangement of the hydrogen bonding network of surface
residues during the optimization, a number of
additional structural constraints are applied in the
geometry optimizations, i.e. the residues S50, P133,
Q156, L277 and P280 are kept fixed. These (surface)
residues are observed to rearrange and form
new hydrogen bonds in optimizations when no constraints
are applied. Omitting the constraints leads
to inconclusive barrier shapes containing many irregular
minima along the reaction coordinate, which
do not permit a reaction barrier to be readily defined.
For the analysis, the reaction barrier is defined by 
the difference between the highest energy point on 
the reaction profile and the energy corresponding 
to the enzyme substrate complex. From our calcu- 
lations (PM6//MOZYME in vacuum), we estimate 
the wild type (WT) barrier to be 7.5 kcal/mol. 
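
The interpolation schedule and the barrier definition given above can be sketched in a few lines of Python. This is only an illustration: the function names, the choice of letting the frame index run from 0 (ES) to 10 (TI), and all numerical values are assumptions made for the example and are not taken from the study.

```python
def constraint_distances(d_ini, d_fin, n_frames=10):
    """Return the constrained C20-Ogamma distance d_i for i = 0..n_frames."""
    step = (d_ini - d_fin) / n_frames
    return [d_ini - i * step for i in range(n_frames + 1)]


def reaction_barrier(energies):
    """Barrier = highest energy on the profile minus the ES energy (first point)."""
    return max(energies) - energies[0]


# Hypothetical distances in Angstrom (not values from the study).
distances = constraint_distances(d_ini=3.0, d_fin=1.5)

# Hypothetical energies in kcal/mol, one per frame, ES first.
profile = [0.0, 0.4, 1.1, 2.3, 3.9, 5.6, 6.8, 7.5, 7.1, 6.2, 5.9]
barrier = reaction_barrier(profile)
```

In the actual workflow each distance would be used as a constraint in one geometry optimization, and the energies would come from the subsequent SCF calculations.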
Experimentally, specific activity of hydrolysis is de- 
termined. Given first order kinetics, saturation of 
the enzyme with substrate and fast binding and 
product release, the catalytic rate constant k_cat is
directly proportional to the specific activity under
the assumption that the amount of active enzyme
remains constant. This therefore allows the catalytic
rate constant k_cat and, hence, the barrier
height to be compared to the improvement factors
reported in the results section. The approximations
used here in relating the barrier height on the potential
energy surface to k_cat have been discussed
previously [22].
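
As a reminder of why barrier heights and activities can be compared at all, one standard relation is sketched below. It is the usual transition state theory expression and is stated here as an assumption for illustration; the specific approximations adopted by the authors are those discussed in ref. 22. The symbols $\Delta E^{\ddagger}_{mut}$ and $\Delta E^{\ddagger}_{WT}$ denote the computed barriers of a mutant and of the wild type.

```latex
\frac{k_{cat}^{mut}}{k_{cat}^{WT}} \approx
\exp\left( - \frac{\Delta E^{\ddagger}_{mut} - \Delta E^{\ddagger}_{WT}}{RT} \right)
```

A mutant barrier below the wild type barrier thus corresponds to a ratio above one. At the assay temperature of 37°C, RT is roughly 0.62 kcal/mol. In this work only the qualitative direction of the change is used, not the magnitude of the ratio.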



It is noted that using one CPU per interpolation 
frame on the reaction barrier, the complete barrier 
of one mutant can be computed with 10 CPUs usu- 
ally within less than 12 hours of wall clock time (for 
a molecular model of the size used in this study). 
Given a set of molecular models of the enzyme, and 
100 available CPUs, it is possible to screen around 
1000 mutants within one week. 
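
The throughput estimate can be checked with a short back-of-the-envelope calculation. The sketch below assumes that mutants are processed in parallel batches of equal duration; the function names are hypothetical. Taking the 12 h figure as the time per mutant gives a conservative estimate, and the second function reports the average time per mutant that a given weekly target would require under the same assumptions.

```python
HOURS_PER_WEEK = 7 * 24


def mutants_per_week(total_cpus=100, cpus_per_mutant=10, hours_per_mutant=12.0):
    """Mutants screened in one week if barriers run in parallel batches."""
    concurrent = total_cpus // cpus_per_mutant
    batches = int(HOURS_PER_WEEK // hours_per_mutant)
    return concurrent * batches


def hours_needed(target_mutants, total_cpus=100, cpus_per_mutant=10):
    """Average wall-clock time per mutant needed to reach a weekly target."""
    concurrent = total_cpus // cpus_per_mutant
    return HOURS_PER_WEEK * concurrent / target_mutants


worst_case = mutants_per_week()   # 12 h taken as the per-mutant time
typical = hours_needed(1000)      # hours per mutant implied by ~1000 per week
```

With these assumptions the 12 h upper bound yields 140 mutants per week, while a rate of around 1000 mutants per week corresponds to an average of about 1.7 h of wall clock time per mutant.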

Combination Mutants 

The molecular model of the enzyme and the positions
of the point mutations in the enzyme are illustrated
in Fig. 2. The point mutations are listed in
Table 2. Two sets of mutants are introduced in this
section: a benchmarking set S and a combinatorial
set L, the definitions of which are provided in the
following.

The point mutations are selected based on different
design principles. These are either introduction
of structural rearrangements in the active site to
change the binding site properties of the active site
(residues P38, G39, G41, T42, T103) [1], introduction
of space to accommodate the substrate (W104,
L278, A282, I285, V286), introduction of dipolar
interactions between the enzyme and the substrate
(A132, A141, I189) [33] or reduction of polarity in
the active site (D223). The mutants of the benchmarking
study are collected in a small set S (22 mutants,
Table 1). For the combinatorial study, out of
the above we select six residues (G39, T103, W104,
A141, I189, L278) which, it is assumed, contribute
most strongly to increased activity and define the
mutations at each position as listed in Table 3. Given
the position i and the number of mutations at each
position g_i, in general the upper limit for the number
of mutants M in a combinatorial study can be
calculated by writing a sum term for each type (i.e.
"order") of combination mutant, i.e. single, double,
..., such that

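$$
M = \underbrace{\sum_{i} g_i}_{\text{Single } (o=1)}
  + \underbrace{\sum_{\substack{i,j \\ j>i}} g_i \cdot g_j}_{\text{Double } (o=2)}
  + \underbrace{\sum_{\substack{i,j,k \\ k>j>i}} g_i \cdot g_j \cdot g_k}_{\text{Triple } (o=3)}
  + \dots
$$
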
where each sum term consists of $\binom{N}{o}$ individual
terms (N and o being the number of positions which
can be mutated and the order of the mutant, respectively).
By this scheme, considering the mutations
listed in Table 3, hypothetically 424 (= 13
+ 64 + 154 + 193) single to four-fold mutants can
be constructed. This number is reduced by applying
the restriction that, out of the 424 hypothetically
possible mutants, 0 single, 2 double, 12 triple
and 24 four-fold combination mutants including the
pair A141N/Q-I189Y (38 in total) are discarded because in the
molecular modeling, these side chains could not be
accommodated spatially in the same mutant, which
leaves 386 mutants. We further
note that 15 out of these remaining 386 mutants
(Table 3) are present also in the benchmarking set
S and thus the combinatorial study consists of 371
unique mutants. A detailed documentation of the
number of screened mutants in the combinatorial
study is provided in Table 4.
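
The counts quoted above can be reproduced by explicit enumeration. The following sketch builds every single to four-fold mutant from the mutations of Table 3 and removes those containing the A141N/Q-I189Y pair; the function and variable names are chosen for this illustration only.

```python
from itertools import combinations, product

# Mutations per position, as listed in Table 3.
MUTATIONS = {
    39: ["G39A"],
    103: ["T103G"],
    104: ["W104F", "W104Q", "W104Y"],
    141: ["A141N", "A141Q"],
    189: ["I189A", "I189G", "I189H", "I189N", "I189Y"],
    278: ["L278A"],
}


def enumerate_mutants(mutations, max_order=4):
    """Yield every mutant as a tuple of point mutations, orders 1..max_order."""
    positions = sorted(mutations)
    for order in range(1, max_order + 1):
        for chosen in combinations(positions, order):
            for combo in product(*(mutations[p] for p in chosen)):
                yield combo


def clashes(mutant):
    """True if the mutant holds A141N or A141Q together with I189Y."""
    has_a141 = any(m in ("A141N", "A141Q") for m in mutant)
    return has_a141 and "I189Y" in mutant


all_mutants = list(enumerate_mutants(MUTATIONS))
possible = {o: sum(len(m) == o for m in all_mutants) for o in range(1, 5)}
excluded = {o: sum(len(m) == o and clashes(m) for m in all_mutants) for o in range(1, 5)}
remaining = len(all_mutants) - sum(excluded.values())

print(possible)   # {1: 13, 2: 64, 3: 154, 4: 193}
print(excluded)   # {1: 0, 2: 2, 3: 12, 4: 24}
print(remaining)  # 386
```

Enumerating the mutants explicitly, rather than only evaluating the sum formula, also yields the list of mutant names needed to set up the individual calculations.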

Prior to analysis, the reaction barriers of the combination
mutants are inspected visually and mutants
with irregularly shaped barriers, i.e. consisting
of multiple peaks of similar height along
the reaction coordinate, are discarded. Furthermore,
out of the mutants with regular reaction barrier
shapes, we discard those mutants with barriers
>19.0 kcal/mol (i.e. the largest calculated barrier
from set S). Following these selection criteria, 61
mutants are discarded because of inconclusive barrier
shapes and 47 mutants because the barrier is
higher than 19 kcal/mol (a distribution of reaction
barriers is shown in Fig. S1). After these filtering
steps, 278 mutants remain in the combinatorial
study which we collect in the large set L (out of
which 15 are in set S). An overview of the distribution
of reaction barriers for the mutants from set
L is provided in Fig. S2 of the supporting information.
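
In this study the barrier shapes are inspected visually. For larger screens the two filters could be approximated automatically; the sketch below is one hypothetical heuristic, not the procedure used here. The function names, the tolerance of 1.0 kcal/mol for "similar height" and the example profiles are all assumptions.

```python
def count_similar_peaks(energies, tolerance=1.0):
    """Count interior local maxima within `tolerance` kcal/mol of the highest one."""
    peaks = [
        energies[i]
        for i in range(1, len(energies) - 1)
        if energies[i] > energies[i - 1] and energies[i] >= energies[i + 1]
    ]
    if not peaks:
        return 0
    top = max(peaks)
    return sum(1 for p in peaks if top - p <= tolerance)


def keep_mutant(energies, max_barrier=19.0, tolerance=1.0):
    """Keep a profile with at most one dominant peak and barrier <= max_barrier."""
    barrier = max(energies) - energies[0]
    return count_similar_peaks(energies, tolerance) <= 1 and barrier <= max_barrier


# Hypothetical energy profiles in kcal/mol, ES first.
regular = [0.0, 2.0, 5.0, 9.0, 7.0, 4.0]
irregular = [0.0, 6.0, 3.0, 6.5, 2.0, 1.0]
too_high = [0.0, 10.0, 21.0, 15.0]

# Bookkeeping from the text: 386 candidates, 61 irregular, 47 above 19.0 kcal/mol.
set_l_size = 386 - 61 - 47
```

Any such heuristic would need to be checked against visual inspection on a subset of profiles before being relied upon.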

We note that in set S, all barriers appear regular 
in shape and no mutant contains the A141N/Q and 
I189Y pair. 



Results and Discussion 



Set S 

The correspondence of the computed barriers from
set S with the experimental assay is shown in Fig. 3.
The exact data is reported in Table 1. A scatterplot
of calculated reaction barriers is presented
in Fig. 4.

We note that in set S, the highest experimen- 
tally observed activity is around 11 times the wild 
type activity (G39A-T103G-W104F-L278A, Table
1), while roughly ten mutants show no increased
activity. In total, six mutants show 3-fold or higher 
wild type activity. In the calculations, only one 
mutant is observed to have a lower barrier than 
the wild type (7.3 kcal/mol, G39A-T103G-L278A) 
and the highest observed barrier is 18.9 kcal/mol 
(I189G). 

Given the approximations introduced to make the 
method sufficiently efficient, it is noted that the in- 
tent of the method is not a quantitative ranking of 
the reaction barriers, but to identify promising mu- 
tants for, and to eliminate non-promising mutants 
from, experimental consideration. Therefore only 
qualitative changes in overall activity are consid- 
ered. 



We categorize the experimentally observed activi- 
ties and the predicted reaction barriers as follows. 
From experiment, a mutant with activity of 1.2 
(0.8) times the wild type activity or higher (lower) is 
considered as improving (degrading). Correspondingly,
the computed difference in reaction barrier
height between a mutant and the wild type is expressed
in qualitative terms. For the comparison
with the experimental activity assay, we define a
barrier cutoff c_S = 12.5 kcal/mol to distinguish between
potentially improving and degrading mutants
in set S.

A mutant with a predicted barrier >c_S (12.5
kcal/mol) is considered likely to have decreased activity
compared to the wild type while mutants with
reaction barriers <c_S are considered likely to have
increased activity.

We note that defining the cutoff is done purely for 
a post hoc comparison of experimental and com- 
puted data. When using the computed barriers to 
identify promising experimental mutants, one sim- 
ply chooses the N mutants with the lowest barriers, 
where N is the number of mutants affordable to do 
experimentally (e.g. 20 in the discussion of set L). 
Based on this approach, qualitative activity of 15 
out of 22 mutants is correctly predicted. It is noted 
that the correlation is best for mutants with largest 
activity difference compared to wild type (both positive
and negative). For example the method identifies
four of the six most active mutants with >3-fold
wild type activity. Similarly, the method identifies
seven out of the eight least active mutants with
<0.5-fold wild type activity. For mutants with only
small differences in activity compared to wild type, 
the predictions are less accurate. 
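
The post hoc comparison can be reproduced from the categories and barriers of Table 1. The sketch below is illustrative: the function name is hypothetical, and barriers equal to the cutoff are treated as degrading, which is an assumption that matches the categories listed in Table 1. One row of Table 1 has no readable name or barrier in this copy, so only its two categories enter the count.

```python
CUTOFF = 12.5  # kcal/mol, c_S in the text


def classify(barrier, cutoff=CUTOFF):
    """+1 (likely improving) if barrier < cutoff, else -1 (likely degrading)."""
    return 1 if barrier < cutoff else -1


# (mutant, experimental category, computed barrier in kcal/mol), from Table 1.
TABLE_1 = [
    ("G39A-T103G-W104F-L278A", 1, 13.9),
    ("G39A-L278A", 1, 11.3),
    ("G39A-W104F", 1, 10.6),
    ("G39A-T103G-L278A", 1, 7.3),
    ("G39A-W104F-L278A", 1, 11.8),
    ("T103G", 1, 13.6),
    ("G39A-W104F-I189Y-L278A", 1, 10.9),
    ("G39A", 1, 11.2),
    ("W104F", 1, 12.0),
    ("G39A-T103G-W104Q-L278A", 1, 12.8),
    ("G39A-T103G-W104F-D223G-L278A", 1, 11.3),
    ("G39A-T103G", -1, 7.5),
    ("G39A-T42A-T103G-W104F-L278A", -1, 10.4),
    ("I189H", -1, 12.9),
    ("G39A-I189G-L278A", -1, 10.7),
    ("G41S", -1, 13.4),
    ("I189G", -1, 18.9),
    ("G39A-T103G-W104F-I189H-D223G-L278A", -1, 13.7),
    ("G39A-T103G-W104F-I189H-L278A-A282G-I285A-V286A", -1, 12.9),
    ("A132N", -1, 12.5),
    ("P38H", -1, 12.5),
]
# Row without readable name or barrier: (experimental, computed) categories only.
ROWS_WITHOUT_BARRIER = [(1, -1)]

hits = sum(exp == classify(barrier) for _, exp, barrier in TABLE_1)
hits += sum(exp == calc for exp, calc in ROWS_WITHOUT_BARRIER)
total = len(TABLE_1) + len(ROWS_WITHOUT_BARRIER)

print(f"{hits}/{total} correct ({hits / total:.0%})")  # 15/22 correct (68%)
```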

Set L 

Set L is screened to identify new mutants for 
which increased activity is predicted. The 20 
mutants with the lowest barriers are suggested as 
candidates for further experimental study in Table
5. The distributions of reaction barriers, resolved
by mutations at positions 104 and 189, are shown
in Figs. 5A and 5B.

In set L, three new mutants are identified with
barriers lower than the predicted wild type barrier.
Out of the 20 mutants suggested in Table 5, three
are double mutants, seven are three-fold and ten
are four-fold mutants. No single mutants were
found for which increased activity compared to
wild type is predicted. All mutants except one contain
the G39A mutation, five contain the T103G
mutation, six contain a mutation of W104, 13
contain a mutation of A141, 16 contain a mutation
of I189 and eight contain the L278A mutation.
From this observation it appears likely that mutations
of G39, A141 and I189 contribute to an
increased activity of the mutant and should thus
be included in future experimental activity assays.
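
The composition of the 20 suggested mutants can be tallied directly from the names in Table 5. The following sketch is illustrative; it assumes that mutant names are hyphen-separated point mutations of the form wild type residue, position, new residue.

```python
from collections import Counter

# The 20 lowest-barrier mutants of set L, as listed in Table 5.
TOP20 = [
    "G39A-T103G-I189Y", "G39A-I189Y", "G39A-A141Q-I189G-L278A",
    "G39A-A141N-L278A", "G39A-A141N", "G39A-A141N-I189H-L278A",
    "G39A-W104F-A141Q-I189A", "G39A-A141Q-I189N", "G39A-A141N-I189N",
    "G39A-T103G-W104Y-A141N", "G39A-W104Y-I189Y", "G39A-A141N-I189N-L278A",
    "G39A-W104F-A141N", "G39A-I189H-L278A", "G39A-A141N-I189A-L278A",
    "W104Y-I189H", "G39A-T103G-W104F-I189Y", "G39A-A141Q-I189A-L278A",
    "G39A-T103G-A141Q-I189H", "G39A-T103G-I189A-L278A",
]

# Order of each mutant = number of point mutations in its name.
orders = Counter(len(name.split("-")) for name in TOP20)

# Position of each point mutation = the digits between the two residue letters.
positions = Counter(m[1:-1] for name in TOP20 for m in name.split("-"))

print(dict(orders))     # {3: 7, 2: 3, 4: 10}
print(dict(positions))  # {'39': 19, '103': 5, '189': 16, '141': 13, '278': 8, '104': 6}
```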
Set L is further analysed in terms of the effect of 
the mutations at the positions 104 and 189. For the 
mutations of W104, we note that single mutations 
which give rise to relatively high barriers (W104Q, 
W104Y, Fig. 5A) can have significantly lower
barriers in combination with other mutations. For
example, out of the sixty mutants with lowest
barriers (Fig. S3), 33 contain a mutation of W104
out of which 17 are suggested to be W104F,
while 14 are suggested to be W104Y (two contain
W104Q).

The mutation of I189 is analysed in a similar way.
In set L, five different mutations of this residue are
screened (Table 3). The single mutant with the
lowest barrier is I189Y and the two mutants with
the lowest predicted barrier contain this mutation
as well (Table 5). Similarly to above, higher order
mutants containing I189A, I189G, I189H or I189N
are predicted to have considerably lower barriers
than the corresponding single mutants (Fig. 5B).
In particular, out of the mutants listed in Table
5, four contain the I189A mutation, one contains the I189G
mutation, four contain the I189H mutation and
three contain the I189N mutation.
As a special case we highlight that the single
mutant I189G has one of the highest calculated
barriers (18.9 kcal/mol, Table 1), whereas the
four-fold mutant G39A-A141Q-I189G-L278A has
one of the lowest barriers (6.3 kcal/mol, Table
5). Interestingly, the mutant G39A-A141Q-L278A
has an intermediate barrier (10.9 kcal/mol). It
would appear that I189G as a single mutant is
counterproductive (high computed barrier) but
lowers the barrier of G39A-A141Q-L278A.
Observations such as these should be kept in mind when
selecting the single mutants to be considered when
preparing higher order mutants.
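
The non-additive effect of I189G can be made explicit with the barriers quoted above. The short sketch below simply compares the change in computed barrier caused by adding I189G to two different backgrounds; the variable names are chosen for this illustration.

```python
# Computed barriers in kcal/mol quoted in the text and in Tables 1 and 5.
BARRIER = {
    "WT": 7.5,
    "I189G": 18.9,
    "G39A-A141Q-L278A": 10.9,
    "G39A-A141Q-I189G-L278A": 6.3,
}

# Effect of adding I189G to the wild type and to the triple mutant background.
effect_on_wt = BARRIER["I189G"] - BARRIER["WT"]
effect_on_triple = BARRIER["G39A-A141Q-I189G-L278A"] - BARRIER["G39A-A141Q-L278A"]

print(f"I189G on WT: {effect_on_wt:+.1f} kcal/mol")                    # +11.4
print(f"I189G on G39A-A141Q-L278A: {effect_on_triple:+.1f} kcal/mol")  # -4.6
```

The opposite signs show that the computed contribution of a point mutation depends on the background in which it is introduced.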



Conclusions 



Our previously presented method for high throughput
computational screening of mutant activity [22]
is benchmarked against experimentally measured 
amidase activity for 22 mutants of Candida antarc- 
tica lipase B (CalB). 

Experimentally, amidase activity is successfully introduced
in 12 mutants; the highest activity is determined
to be 11.2-fold over the wild type activity.
Using an appropriate cutoff criterion for the computed
barriers, the qualitative activity of 15 out of
22 mutants is correctly predicted. It is noted that
the correlation is best for mutants with largest activity
difference compared to wild type (both positive
and negative). For example the method identifies
four of the six most active mutants with >3-fold
wild type activity. Similarly, the method identifies
seven out of the eight least active mutants with
<0.5-fold wild type activity.

Thus validated, the computational method is used
to screen all sterically possible (386) double-, triple-
and quadruple-mutants constructed from the most
active single mutants. Based on the benchmark test
at least 20 new promising mutants are identified.
Interestingly, we observe that single mutants that
are predicted to have low activity appear to have
high activity in combination with other mutations.
This is illustrated in specific analysis of effects of
mutations of two different positions (104 and 189).

Figure 1. Reaction scheme for the formation of TI. Nucleophilic attack by Oγ of S105 on
carbonyl carbon C20 of substrate. R1: -CH2-Cl, R2: -CH2-C6H5.

Figure 2. Positions of point mutations. A: Overlay of mutations W104[F, Q, Y]. B: Overlay of
mutations of A141[N, Q]. C: Overlay of mutations of I189[A, G, H, N, Y]. D: Mutations P38H, G39A,
G41S, T42A, T103G, A132N, L278A, A282G, I285A and V286A. Substrate shown in magenta.

Figure 3. Comparison of experimental and computed activities. 1/-1 correspond to
increased/decreased overall activity, respectively. Prediction rate is 15/22 (68%).

Figure 4. Barrier scatter plot of set S. 22 mutants; the cutoff value c_S is discussed in the text.

Figure 5. Barrier scatter plots of set L. In both panels, the labels indicate mutants containing the
labeled and possibly additional mutations up to the indicated order. "OTHER" indicates a mutant not
containing any of the labeled mutations or of order higher than four. A: Mutations of W104. B:
Mutations of I189.

Tables



Table 1. Experimental overall activities and calculated reaction barriers of Set S. Category 
(Cat.) +1/-1 indicates increased/decreased overall activity. Category definition is discussed in text. Ao
and Pp indicate expression in organisms (Org.) Aspergillus oryzae or Pichia pastoris, respectively.



Species                                          Exp. Activity  Cat.  Calc. Barrier  Cat.  Org.
                                                 [*WT]                [kcal/mol]
G39A-T103G-W104F-L278A                           11.2            1    13.9           -1    Ao
G39A-L278A                                       [illegible]     1    11.3            1    Pp
G39A-W104F                                       4.2             1    10.6            1    Ao
G39A-T103G-L278A                                 3.8             1    7.3             1    Ao
G39A-W104F-L278A                                 [illegible]     1    11.8            1    Pp
T103G                                            3.0             1    13.6           -1    Ao
G39A-W104F-I189Y-L278A                           2.9             1    10.9            1    Pp
G39A                                             2.8             1    11.2            1    Ao
[illegible]                                      2.5             1    [illegible]    -1    Ao
W104F                                            2.0             1    12.0            1    Ao
G39A-T103G-W104Q-L278A                           1.9             1    12.8           -1    Ao
G39A-T103G-W104F-D223G-L278A                     1.5             1    11.3            1    Pp
G39A-T103G                                       0.8            -1    7.5             1    Ao
G39A-T42A-T103G-W104F-L278A                      0.7            -1    10.4            1    Pp
I189H                                            0.5            -1    12.9           -1    Pp
G39A-I189G-L278A                                 0.4            -1    10.7            1    Pp
G41S                                             0.3            -1    13.4           -1    Pp
I189G                                            0.2            -1    18.9           -1    Pp
G39A-T103G-W104F-I189H-D223G-L278A               0.1            -1    13.7           -1    Pp
G39A-T103G-W104F-I189H-L278A-A282G-I285A-V286A   0.1            -1    12.9           -1    Pp
A132N                                            0.0            -1    12.5           -1    Pp
P38H                                             0.0            -1    12.5           -1    Pp
WT                                               1.0                  7.5

Table 2. Point mutations. The term active site refers to residues with potential direct Van der
Waals contact to the substrate. The term first shell / second shell refers to residues which are adjacent
to an active site/first shell residue.



Target   Mutations       Type           Description
P38      H               Second shell   (H neutral)
G39      A               First shell
G41      S               First shell
T42      A               Second shell
T103     G               First shell
W104     F, Q, Y         Active site
A132     N               First shell
A141     N, Q            Active site
I189     A, G, H, N, Y   Active site    (G including additional water, H neutral)
D223     G               First shell    (Increase of charge by +1)
L278     A               Active site
A282     G               Active site
I285     A               Active site
V286     A               First shell








Table 3. Side chains used for generation of combinatorial set L. i and g_i indicate the position
in the backbone and the number of mutations at that position, respectively.



Mutation              i     g_i
G39A                  39    1
T103G                 103   1
W104{F, Q, Y}         104   3
A141{N, Q}            141   2
I189{A, G, H, N, Y}   189   5
L278A                 278   1

Table 4. Combinatorial study details. From the possible mutants, the combinations containing the
pair A141N/Q-I189Y, the mutants with inconclusive barriers and the mutants with barriers 
>19.0 kcal/mol are subtracted to give the number of mutants in set L. "Only Set L" indicates the 
number of mutants uniquely present in set L and not in set S. 



Order       Possible   Containing      Inconclusive   Barrier >19.0   Set L   Only
                       A141N/Q-I189Y   barrier        [kcal/mol]              Set L
Single      13         0               0              0               13      7
Double      64         2               4              8               50      47
Triple      154        12              21             20              101     98
Four-fold   193        24              36             19              114     111
Total       424        38              61             47              278     263






Table 5. Selection of mutants from set L with lowest barriers. 



Mutation                  Barrier [kcal/mol]
G39A-T103G-I189Y          5.7
G39A-I189Y                6.2
G39A-A141Q-I189G-L278A    6.3
G39A-A141N-L278A          7.6
G39A-A141N                7.7
G39A-A141N-I189H-L278A    8.3
G39A-W104F-A141Q-I189A    8.3
G39A-A141Q-I189N          9.1
G39A-A141N-I189N          9.3
G39A-T103G-W104Y-A141N    9.8
G39A-W104Y-I189Y          9.8
G39A-A141N-I189N-L278A    10.1
G39A-W104F-A141N          10.1
G39A-I189H-L278A          10.2
G39A-A141N-I189A-L278A    10.2
W104Y-I189H               10.4
G39A-T103G-W104F-I189Y    10.4
G39A-A141Q-I189A-L278A    10.4
G39A-T103G-A141Q-I189H    10.4
G39A-T103G-I189A-L278A    10.5

Figure S1. Discarded mutants with barriers >19.0 kcal/mol. Single mutants: 0; Double
mutants: 8; Triple mutants: 20; Four-fold mutants: 19; Total: 47.

Figure S2. Combination mutants from set L. Single mutants: 13; Double mutants: 50; Triple mutants:
101; Four-fold mutants: 114; Total: 278.

Figure S3. Low-barrier mutants. 33 out of 60 contain a mutation of W104 (W104F: 17, W104Y:
14, W104Q: 2).



